Today we will…
R + RStudio
Hi, I’m Dr. Rehnberg!
I am a transplant to the west coast – PA to MO to MI to CA.
My favorite things are being outside, drinking tea, and watching reality TV.
I love R and I’m excited to share that with you this quarter!
I have a genetic, degenerative eye disease called Stargardt disease, which causes me to have poor vision, even with corrective lenses.
What this means for you:
When I am helping you on your computer, please make the font large and turn the brightness up.
I have difficulty recognizing faces – please be patient!
Questions?
We will be joined in class by Libby.
Libby is…
A second-year Statistics major pursuing a Data Science minor.
Originally from the Bay Area.
A puzzler, traveler, and volunteer helping to raise and train guide dogs!
I am looking forward to reading your introductions on Canvas Discussions!
What is R?
R is a programming language designed originally for statistical analyses.
R was created by Ross Ihaka and Robert Gentleman in 1993.
R was formally released by the R Core Group in 1997.
R’s strengths are…
… handling data with lots of different types of variables.
… making nice and complex data visualizations.
… having cutting-edge statistical methods available to users.
R’s weaknesses are…
… performing non-analysis programming tasks, like website creation (Python, Ruby, …).
… performing hyper-efficient numerical computation (MATLAB, C, …).
… being a simple tool for all audiences (SPSS, Stata, JMP, …).
The heart and soul of R are packages.
… R when installed.
R packages live on the Comprehensive R Archive Network, or CRAN.
To install a package, use install.packages() with the package name in quotes, e.g. install.packages("tibble").
Importantly, R is open-source.
There is no company that owns R, like there is for SAS or MATLAB.
R code!
This means packages are created by users like you and me!
RStudio is an IDE (Integrated Development Environment).
A directory is just a fancy name for a folder.
Your working directory is the folder that R “thinks” it lives in at the moment.
[1] "/Users/zrehnber/Documents/Teaching/Stat_331/material/lecture_slides/W1_intro_R"
This file lives in my user files Users/…
…on my account zrehnber/ …
…in my Documents folder …
…in a series of organized folders.
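To check this on your own computer, here is a minimal sketch using only base R. The variable names wd and folders are just for illustration, and the output will differ on every machine.

```r
# Ask R which folder it currently "thinks" it lives in
wd <- getwd()
print(wd)

# Split the path into its individual folders
# (R reports paths with forward slashes, even on Windows)
folders <- strsplit(wd, "/", fixed = TRUE)[[1]]
print(folders)

# The last piece is the name of the working directory itself
print(basename(wd))
```

Reading the path from left to right walks from the top of your file system down to the folder you are working in.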
Create a directory for this class!
Is it in a place you can easily find it?
Does it have an informative name?
Are the files inside it well-organized?
An R Project is basically a “flag” planted in a certain directory.
When you double click an .Rproj file, it:
Opens RStudio
Sets the working directory to be wherever the .Rproj file lives.
Links to GitHub, if set up (more on that later!)
R Projects are great for reproducibility!
You can send anyone your folder with your .Rproj file and they will be able to run your code on their computer.
We will be using R Projects throughout this course.
You can send your project to someone else, and they can jump in and start working right away.
This means:
Files are organized and well-named.
References to data and code work for everyone.
Package dependency is clear.
Code will run the same every time, even if data values change.
Analysis process is well-explained and easy to read.
Treat Stat331 as one project, and not each assignment as a new project. Do not put projects within projects!
If you put something like this at the top of your .qmd file (more on Quarto later), I will set your computer on fire:
setwd("/User/zrehnber/Stat331/lab1/Desktop/stuff/")
Setting working directory by hand = BAD!
That directory is specific to you!
Quarto ignores this code when knitting!
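One alternative to setting the working directory by hand is to refer to files relative to the project folder. The sketch below uses base R only; the folder name data and the file name penguins.csv are hypothetical, chosen just for illustration.

```r
# Build a path relative to the project folder instead of calling setwd().
# "data" and "penguins.csv" are hypothetical names used for illustration.
data_path <- file.path("data", "penguins.csv")
print(data_path)

# Check whether the file exists before trying to read it
if (file.exists(data_path)) {
  penguins <- read.csv(data_path)
} else {
  message("No file found at ", data_path, " (relative to ", getwd(), ")")
}
```

Because the path never mentions a user name or a Desktop folder, the same line works for anyone who opens the project.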
R Basics
A value is a basic unit of stuff that a program works with.
Values have types:
Variables are names that refer to values.
A variable is like a container that holds something - when you refer to the container, you get whatever is stored inside.
We assign values to variables using the syntax object_name <- value.
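A short sketch of values, types, and assignment. The variable names fav_drink and cups_per_day are made up for this example.

```r
# Values have types; class() reports the type of a value
class(3.14)      # "numeric"
class("tea")     # "character"
class(TRUE)      # "logical"
class(5L)        # "integer"

# Assign a value to a variable with <-
fav_drink <- "tea"
cups_per_day <- 3

# Referring to the variable gives back whatever is stored in it
fav_drink          # "tea"
cups_per_day * 7   # 21
```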
Homogeneous: every element has the same data type.
Vector: a one-dimensional column of homogeneous data.
Matrix: a two-dimensional set of homogeneous data arranged in a rectangular format.
Heterogeneous: the elements can be of different types.
List: a one-dimensional column of heterogeneous data.
Dataframe: a two-dimensional set of heterogeneous data arranged in a rectangular format.
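The four structures above can be built with base R as follows. This is a sketch with invented example data; all object names are hypothetical.

```r
# Vector: one-dimensional, homogeneous
temps <- c(61.5, 70.2, 58.9)

# Matrix: two-dimensional, homogeneous
grid <- matrix(1:6, nrow = 2)

# List: one-dimensional, heterogeneous
person <- list(name = "Sam", year = 2, likes_puzzles = TRUE)

# Data frame: two-dimensional, heterogeneous (each column is a vector)
pets <- data.frame(
  name = c("Rex", "Mittens"),
  age = c(4, 7),
  is_dog = c(TRUE, FALSE)
)

# Mixing types in a vector forces everything to one type
mixed <- c(1, "a", TRUE)
class(mixed)  # "character"

dim(grid)     # 2 3
str(pets)     # shows the type of each column
```

Note that the vector quietly converted every element to character, while the list and data frame kept each type as given.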
We use square brackets ([]) to access elements within data structures.
In R, we start indexing from 1.
We can combine logical statements using and, or, and not.
(X AND Y) requires that both X and Y are true.
(X OR Y) requires that at least one of X or Y is true.
(NOT X) is true if X is false, and false if X is true.
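Square-bracket indexing and logical statements are often used together. In R, AND is written &, OR is written |, and NOT is written !. The vector scores below is invented for illustration.

```r
scores <- c(72, 88, 95, 61, 79)

# Indexing starts at 1, so this is the first element
scores[1]         # 72

# Several positions at once
scores[c(2, 3)]   # 88 95

# AND (&): both conditions must be true
scores[scores > 70 & scores < 90]   # 72 88 79

# OR (|): at least one condition must be true
scores[scores < 65 | scores > 90]   # 95 61

# NOT (!): flips TRUE and FALSE
scores[!(scores > 70)]              # 61
```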
Just because you see scary red text, this does not mean something went wrong! This is just R communicating with you.
Often, R will give you a warning.
This means that your code did run…
…but you probably want to make sure it succeeded.
Does this look right?
If the word Error appears in your message from R, then you have a problem.
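To see the difference between a warning and an error, here is a small sketch. It uses tryCatch() to capture the error so the rest of the code keeps running; the object names nums and result are just for this example.

```r
# A warning: the code runs and returns a result, but R flags something odd
nums <- as.numeric(c("1", "2", "three"))
nums   # 1 2 NA, with a warning that NAs were introduced by coercion

# An error: the code stops and no result is produced.
# tryCatch() lets us capture the error instead of halting.
result <- tryCatch(
  log("ten"),
  error = function(e) {
    # conditionMessage() extracts the text of the error
    paste("R stopped with an error:", conditionMessage(e))
  }
)
result
```

The warning still gave us a vector back (with an NA to double-check), whereas log("ten") never produced a value at all.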
seq(from = 1, to = 10, by = 1
Error: <text>:2:0: unexpected end of input
1: seq(from = 1, to = 10, by = 1
^
seq(from = 1, to = 10 by = 1)
sequence(from = 1, to = 10, by = 1)
Error in sequence.default(from = 1, to = 10, by = 1): argument "nvec" is missing, with no default
sqrt('1')
my_obj(5)
Error in my_obj(5): could not find function "my_obj"
R says… Error: Object some_obj not found.
R says… Error: Object of type ‘closure’ is not subsettable.
R says… Error: Non-numeric argument to binary operator.
Look at the help file for the function!
When all else fails, Google your error message.
Leave out the specifics.
Include the function you are using.
What’s wrong here?
matrix(c("a", "b", "c", "d"), num_row = 2)
Error in matrix(c("a", "b", "c", "d"), num_row = 2): unused argument (num_row = 2)
Hint: try typing ?matrix into your console to view the documentation for this function.
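Besides the help file, you can also ask R directly which arguments a function accepts. A minimal sketch using the base functions formals() and args():

```r
# List the argument names that matrix() accepts
arg_names <- names(formals(matrix))
print(arg_names)

# args() prints the full signature, including default values
args(matrix)
```

Comparing this list against the argument named in an "unused argument" error is often the fastest way to spot a misspelled or invented argument name.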
Scripts (File > New File > R Script) are files of code that are meant to be run on their own.
Scripts can be run in RStudio by clicking the Run button at the top of the editor window when the script is open.
You can also run code interactively in a script by:
highlighting lines of code and hitting run.
placing your cursor on a line of code and hitting run.
placing your cursor on a line of code and hitting Ctrl + Enter or Command + Enter.
Notebooks are an implementation of literate programming.
They allow you to integrate code, output, text, images, etc. into a single document.
E.g.,
Reproducibility!
Markdown (without the “R”) is a markup language.
It uses special symbols and formatting to make pretty documents.
Markdown files have the .md extension.
Quarto uses Markdown, AND it can run and display R code.
Quarto makes moving between outputs straightforward.
A few useful tips for formatting the Markdown text in your document:
R Code Options in Quarto
R code chunk options are included at the top of each code chunk, prefaced with a #| (hashpipe).
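As a sketch, the body of a chunk with a few common options might look like the block below. In a .qmd file the chunk would open with a fence marked {r}; the label tea-summary and the vector cups are made up for this example.

```r
#| label: tea-summary
#| echo: false
#| warning: false

# The #| lines above are chunk options; R itself treats them as comments.
cups <- c(2, 3, 1, 4)
mean(cups)   # 2.5
```

Here echo: false hides the code in the rendered document while still showing its output, and warning: false keeps any warnings out of the rendered document.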
To take your .qmd file and make it look pretty, you have to render it.
Quarto CLI (command line interface) orchestrates each step of rendering:
… knitr.
When you click Render:
R code written in your .qmd file gets run in order.
Part One:
This file has many mistakes in the code. Some are errors that will prevent the file from knitting; some are mistakes that do NOT result in an error.
Fix all the problems in the code chunks.
Part Two:
Follow the instructions in the file to uncover a secret message.
Submit the name of the poem as the answer to the Canvas quiz question.
In this lab, you will begin to familiarize yourself with working in and customizing Quarto documents.